
    The Libra Toolkit for Probabilistic Models

    The Libra Toolkit is a collection of algorithms for learning and inference with discrete probabilistic models, including Bayesian networks, Markov networks, dependency networks, and sum-product networks. Compared to other toolkits, Libra places a greater emphasis on learning the structure of tractable models in which exact inference is efficient. It also includes a variety of algorithms for learning graphical models in which inference is potentially intractable, and for performing exact and approximate inference. Libra is released under a 2-clause BSD license to encourage broad use in academia and industry.
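
    As a rough illustration of what "tractable models in which exact inference is efficient" means in practice, here is a minimal Python sketch of a toy sum-product network answering a marginal query in a single bottom-up pass. This is not Libra's interface (Libra itself is a command-line toolkit); the class names and the toy distribution are invented for the example.

        # A tiny SPN over two binary variables. Marginal queries are answered by
        # one bottom-up evaluation; unobserved variables are marginalized by
        # setting their indicator leaves to 1.
        class Leaf:
            def __init__(self, var, val):
                self.var, self.val = var, val
            def value(self, evidence):
                if self.var not in evidence:
                    return 1.0          # variable is summed out
                return 1.0 if evidence[self.var] == self.val else 0.0

        class Product:
            def __init__(self, children):
                self.children = children
            def value(self, evidence):
                v = 1.0
                for c in self.children:
                    v *= c.value(evidence)
                return v

        class Sum:
            def __init__(self, weighted_children):
                self.weighted_children = weighted_children   # (weight, node) pairs
            def value(self, evidence):
                return sum(w * c.value(evidence) for w, c in self.weighted_children)

        # P(X0, X1) as a mixture of two fully factored components.
        spn = Sum([(0.6, Product([Leaf(0, 1), Leaf(1, 1)])),
                   (0.4, Product([Leaf(0, 0), Leaf(1, 0)]))])
        print(spn.value({0: 1}))   # P(X0 = 1) = 0.6, one pass, no approximation
        print(spn.value({}))       # 1.0: the distribution is normalized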

    Differential Equation Units: Learning Functional Forms of Activation Functions from Data

    Most deep neural networks use simple, fixed activation functions, such as sigmoids or rectified linear units, regardless of domain or network structure. We introduce differential equation units (DEUs), an improvement to modern neural networks, which enables each neuron to learn a particular nonlinear activation function from a family of solutions to an ordinary differential equation. Specifically, each neuron may change its functional form during training based on the behavior of the other parts of the network. We show that using neurons with DEU activation functions results in a more compact network capable of achieving comparable, if not superior, performance when compared to much larger networks.
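
    A minimal sketch of the general idea of a per-neuron learnable activation, assuming a simple hand-picked parametric family rather than the paper's actual family of ODE solutions; the module name, basis functions, and coefficients below are illustrative only (PyTorch assumed available).

        import torch
        import torch.nn as nn

        class LearnableActivation(nn.Module):
            """Each neuron blends a few basis nonlinearities with its own trainable
            coefficients, so its functional form can change during training along
            with the rest of the network."""
            def __init__(self, num_neurons):
                super().__init__()
                self.a = nn.Parameter(torch.ones(num_neurons))
                self.b = nn.Parameter(torch.zeros(num_neurons))
                self.c = nn.Parameter(torch.zeros(num_neurons))
            def forward(self, x):          # x: (batch, num_neurons)
                return self.a * torch.tanh(x) + self.b * torch.relu(x) + self.c * x

        net = nn.Sequential(nn.Linear(16, 32), LearnableActivation(32), nn.Linear(32, 1))
        out = net(torch.randn(8, 16))      # per-neuron coefficients are learned by backprop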

    Learning Tractable Graphical Models

    Probabilistic graphical models have been successfully applied to a wide variety of fields such as computer vision, natural language processing, robotics, and many more. However, for large-scale problems represented using unrestricted probabilistic graphical models, exact inference is often intractable, which means that the model cannot compute the correct value of a joint probability query in a reasonable time. In general, approximate inference has been used to address this intractability, in which the exact joint probability is approximated. An increasingly popular alternative is tractable models. These models are constrained such that exact inference is efficient. To offer efficient exact inference, tractable models either benefit from graph-theoretic properties, such as bounded treewidth, or structural properties such as local structures, determinism, or symmetry. An appealing group of probabilistic models that capture local structures and determinism includes arithmetic circuits (ACs) and sum-product networks (SPNs), in which marginal and conditional queries can be answered efficiently. In this dissertation, we describe ID-SPN, a state-of-the-art SPN learner, as well as novel methods for learning tractable graphical models in a discriminative setting, in particular through introducing Generalized ACs, which combine ACs and neural networks. Using extensive experiments, we show that the proposed methods often achieve better performance compared to selected baselines. This dissertation includes previously published and unpublished co-authored material.
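
    Since the abstract repeatedly refers to marginal and conditional queries being answered efficiently, a short reminder of how a conditional query reduces to two marginal evaluations may help; the `value` method below is the hypothetical marginal-evaluation interface from the SPN sketch above, not code from the dissertation.

        def conditional(model, query, evidence):
            # P(query | evidence) = P(query, evidence) / P(evidence):
            # two bottom-up passes over the same circuit, both exact.
            return model.value({**evidence, **query}) / model.value(evidence)

        # On the toy SPN above: P(X1 = 1 | X0 = 1) = 0.6 / 0.6 = 1.0
        print(conditional(spn, {1: 1}, {0: 1}))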

    Discriminative Structure Learning of Arithmetic Circuits

    The biggest limitation of probabilistic graphical models is the complexity of inference, which is often intractable. An appealing alternative is to use tractable probabilistic models, such as arithmetic circuits (ACs) and sum-product networks (SPNs), in which marginal and conditional queries can be answered efficiently. In this paper, we present the first discriminative structure learning algorithm for ACs, DACLearn (Discriminative AC Learner), which optimizes conditional log-likelihood. Based on our experiments, DACLearn learns models that are more accurate and compact than other tractable generative and discriminative baselines.
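
    A sketch of how the discriminative objective named above, conditional log-likelihood of target variables Y given evidence variables X, would be computed from any tractable model with a marginal-evaluation method (the same hypothetical `value` interface as in the earlier sketches). The greedy structure search that DACLearn performs over candidate features is not shown.

        import math

        def conditional_log_likelihood(model, data):
            # data: list of (x_evidence, y_assignment) pairs, each a dict var -> value
            cll = 0.0
            for x, y in data:
                cll += math.log(model.value({**x, **y})) - math.log(model.value(x))
            return cll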

    Learning Markov Networks With Arithmetic Circuits

    Markov networks are an effective way to represent complex probability distributions. However, learning their structure and parameters or using them to answer queries is typically intractable. One approach to making learning and inference tractable is to use approximations, such as pseudo-likelihood or approximate inference. An alternate approach is to use a restricted class of models where exact inference is always efficient. Previous work has explored low-treewidth models, models with tree-structured features, and latent variable models. In this paper, we introduce ACMN, the first method for learning efficient Markov networks with arbitrary conjunctive features. The secret to ACMN's greater flexibility is its use of arithmetic circuits, a linear-time inference representation that can handle many high-treewidth models by exploiting local structure. ACMN uses the size of the corresponding arithmetic circuit as a learning bias, allowing it to trade off accuracy and inference complexity. In experiments on 12 standard datasets, the tractable models learned by ACMN are more accurate than both tractable models learned by other algorithms and approximate inference in intractable models.
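
    A rough sketch of the kind of learning bias described above: candidate features are scored by likelihood gain minus a penalty on how much they would grow the arithmetic circuit, trading accuracy against inference cost. The feature names, numbers, and penalty weight below are placeholders, not ACMN's actual heuristics.

        def score(candidate, gamma=0.01):
            # Likelihood gain, discounted by the edges the feature would add to the circuit.
            return candidate["loglik_gain"] - gamma * candidate["circuit_edges_added"]

        candidates = [
            {"name": "x2 AND x5",            "loglik_gain": 1.8, "circuit_edges_added": 40},
            {"name": "x1 AND NOT x3 AND x7", "loglik_gain": 2.1, "circuit_edges_added": 900},
        ]
        best = max(candidates, key=score)
        print(best["name"])   # the cheaper feature wins despite a smaller likelihood gain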